189 research outputs found

    Robust subspace learning for static and dynamic affect and behaviour modelling

    Get PDF
    Machine analysis of human affect and behavior in naturalistic contexts has witnessed a growing attention in the last decade from various disciplines ranging from social and cognitive sciences to machine learning and computer vision. Endowing machines with the ability to seamlessly detect, analyze, model, predict as well as simulate and synthesize manifestations of internal emotional and behavioral states in real-world data is deemed essential for the deployment of next-generation, emotionally- and socially-competent human-centered interfaces. In this thesis, we are primarily motivated by the problem of modeling, recognizing and predicting spontaneous expressions of non-verbal human affect and behavior manifested through either low-level facial attributes in static images or high-level semantic events in image sequences. Both visual data and annotations of naturalistic affect and behavior naturally contain noisy measurements of unbounded magnitude at random locations, commonly referred to as ‘outliers’. We present here machine learning methods that are robust to such gross, sparse noise. First, we deal with static analysis of face images, viewing the latter as a superposition of mutually-incoherent, low-complexity components corresponding to facial attributes, such as facial identity, expressions and activation of atomic facial muscle actions. We develop a robust, discriminant dictionary learning framework to extract these components from grossly corrupted training data and combine it with sparse representation to recognize the associated attributes. We demonstrate that our framework can jointly address interrelated classification tasks such as face and facial expression recognition. Inspired by the well-documented importance of the temporal aspect in perceiving affect and behavior, we direct the bulk of our research efforts into continuous-time modeling of dimensional affect and social behavior. Having identified a gap in the literature which is the lack of data containing annotations of social attitudes in continuous time and scale, we first curate a new audio-visual database of multi-party conversations from political debates annotated frame-by-frame in terms of real-valued conflict intensity and use it to conduct the first study on continuous-time conflict intensity estimation. Our experimental findings corroborate previous evidence indicating the inability of existing classifiers in capturing the hidden temporal structures of affective and behavioral displays. We present here a novel dynamic behavior analysis framework which models temporal dynamics in an explicit way, based on the natural assumption that continuous- time annotations of smoothly-varying affect or behavior can be viewed as outputs of a low-complexity linear dynamical system when behavioral cues (features) act as system inputs. A novel robust structured rank minimization framework is proposed to estimate the system parameters in the presence of gross corruptions and partially missing data. Experiments on prediction of dimensional conflict and affect as well as multi-object tracking from detection validate the effectiveness of our predictive framework and demonstrate that for the first time that complex human behavior and affect can be learned and predicted based on small training sets of person(s)-specific observations.Open Acces

    Discriminant incoherent component analysis

    Get PDF
    Face images convey rich information which can be perceived as a superposition of low-complexity components associated with attributes, such as facial identity, expressions, and activation of facial action units (AUs). For instance, low-rank components characterizing neutral facial images are associated with identity, while sparse components capturing non-rigid deformations occurring in certain face regions reveal expressions and AU activations. In this paper, the discriminant incoherent component analysis (DICA) is proposed in order to extract low-complexity components, corresponding to facial attributes, which are mutually incoherent among different classes (e.g., identity, expression, and AU activation) from training data, even in the presence of gross sparse errors. To this end, a suitable optimization problem, involving the minimization of nuclear-and l1 -norm, is solved. Having found an ensemble of class-specific incoherent components by the DICA, an unseen (test) image is expressed as a group-sparse linear combination of these components, where the non-zero coefficients reveal the class(es) of the respective facial attribute(s) that it belongs to. The performance of the DICA is experimentally assessed on both synthetic and real-world data. Emphasis is placed on face analysis tasks, namely, joint face and expression recognition, face recognition under varying percentages of training data corruption, subject-independent expression recognition, and AU detection by conducting experiments on four data sets. The proposed method outperforms all the methods that are compared with all the tasks and experimental settings
    • 

    corecore